Conditional Swap Regret and Conditional Correlated Equilibrium

Authors

  • Mehryar Mohri
  • Scott Yang

Abstract

We introduce a natural extension of the notion of swap regret, conditional swap regret, that allows for action modifications conditioned on the player’s action history. We prove a series of new results for conditional swap regret minimization. We present algorithms for minimizing conditional swap regret with bounded conditioning history. We further extend these results to the case where conditional swaps are considered only for a subset of actions. We also define a new notion of equilibrium, conditional correlated equilibrium, that is tightly connected to the notion of conditional swap regret: when all players follow conditional swap regret minimization strategies, then the empirical distribution approaches this equilibrium. Finally, we extend our results to the multi-armed bandit scenario.
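To make the distinction in the abstract concrete, the following is a minimal sketch, not the authors' algorithm. It only evaluates, in hindsight, the regret of a realized action sequence under two kinds of modification rules: a swap rule that depends on the current action alone, and a conditional swap rule that also depends on the previous action (a conditioning history of length one). The function names, the list-of-lists loss format, and the choice to treat the first round as having an empty history (`None`) are assumptions made for this illustration.

```python
def _regret_by_context(contexts, actions, losses):
    """Regret against the best modification rule that picks one
    replacement action per context (hypothetical helper)."""
    n_actions = len(losses[0])
    incurred = 0.0
    totals = {}  # context -> cumulative loss of each candidate replacement
    for ctx, action, loss in zip(contexts, actions, losses):
        incurred += loss[action]
        row = totals.setdefault(ctx, [0.0] * n_actions)
        for b in range(n_actions):
            row[b] += loss[b]
    # The best rule chooses the cheapest replacement independently per context.
    best = sum(min(row) for row in totals.values())
    return incurred - best


def swap_regret(actions, losses):
    """Context is the action played in the current round."""
    return _regret_by_context(list(actions), actions, losses)


def conditional_swap_regret(actions, losses):
    """Context is (previous action, current action).
    Assumption: the first round uses None as its previous action."""
    previous = [None] + list(actions[:-1])
    return _regret_by_context(list(zip(previous, actions)), actions, losses)


# Action 0 is costly right after playing 0, and free right after playing 1.
actions = [0, 0, 1, 0, 0, 1]
losses = [[0, 1], [1, 0], [1, 0], [0, 1], [1, 0], [1, 0]]

print(swap_regret(actions, losses))              # 0.0
print(conditional_swap_regret(actions, losses))  # 2.0
```

In this toy sequence no fixed swap of action 0 helps, so swap regret is zero, while a rule that replaces action 0 only when the previous action was also 0 saves two units of loss. Because the conditional contexts refine the swap contexts, the conditional value computed this way is never smaller than the swap value.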

Similar resources

Online Learning with Transductive Regret

We study online learning with the general notion of transductive regret, that is, regret with modification rules applying to expert sequences (as opposed to single experts) that are representable by weighted finite-state transducers. We show how transductive regret generalizes existing notions of regret, including: (1) external regret; (2) internal regret; (3) swap regret; and (4) conditional sw...

Learning to play partially-specified equilibrium

In a partially-specified correlated equilibrium (PSCE) the players are partially informed of the conditional strategies of the other players, and they best respond to the worst-case possible strategy. We construct a decentralized procedure that converges to PSCE when the monitoring is imperfect. This procedure is based on minimizing conditional regret when players obtain noisy signals that dep...

CS364A: Algorithmic Game Theory Lecture #19: Pure Nash Equilibria and PLS-Completeness

1. The Big Picture. We now have an impressive list of tractability results (polynomial-time algorithms and quickly converging learning dynamics) for several equilibrium concepts in several classes of games. Such tractability results, especially via reasonably natural learning processes, lend credibility to the predictive power of these equilibrium concepts. See also Figure 1. [Lecture 17] In ge...

Regret Minimization in Games with Incomplete Information

Extensive games are a powerful model of multiagent decision-making scenarios with incomplete information. Finding a Nash equilibrium for very large instances of these games has received a great deal of recent attention. In this paper, we describe a new technique for solving large games based on regret minimization. In particular, we introduce the notion of counterfactual regret, whi...

CS364A: Algorithmic Game Theory Lecture #18: From External Regret to Swap Regret and the Minimax Theorem

Last lecture we proved that coarse correlated equilibria (CCE) are tractable, in a satisfying sense: there are simple and computationally efficient learning procedures that converge quickly to the set of CCE. Of course, if anything in our equilibrium hierarchy (Figure 1) was going to be tractable, it was going to be CCE, the biggest set. The good researcher is never satisfied and always seeks s...

Publication date: 2014